Summary of Text Feature Selection Method Based on VSM
نویسندگان
چکیده
منابع مشابه
The Feature Selection Method based on Genetic Algorithm for Efficient of Text Clustering and Text Classification
Big Data means a very large amount of data and includes a range of methodologies such as big data collection, processing, storage, management, and analysis. Since Big Data Text Mining extracts a lot of features and data, clustering and classification can result in high computational complexity and the low reliability of the analysis results. In particular, a TDM (Term Document Matrix) obtained ...
متن کاملA Knowledge-Based Feature Selection Method for Text Categorization
A major difficulty of text categorization is the high dimensionality of the original feature space. Feature selection plays an important role in text categorization. Automatic feature selection methods such as document frequency thresholding (DF), information gain (IG), mutual information (MI), and so on are commonly applied in text categorization. Many existing experiments show IG is one of th...
متن کاملA Novel One Sided Feature Selection Method for Imbalanced Text Classification
The imbalance data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. The classification algorithms have more tendencies to the large class and might even deal with the minority class data as the outlier data. The text data is one of t...
متن کاملA Text Classifier Based on Sentence Category VSM
VSM is a mature model of text representation for categorization. Words are commonly used as dimensions of feature space of VSM, but words only provide little semantic information. Sentence category theory is an important component of HNC theory and can provide abundant information about meaning, structure and style of a sentence. We use sentence categories as dimensions of feature space, reduce...
متن کاملMMR-based Feature Selection for Text Categorization
We introduce a new method of feature selection for text categorization. Our MMR-based feature selection method strives to reduce redundancy between features while maintaining information gain in selecting appropriate features for text categorization. Empirical results show that MMR-based feature selection is more effective than Koller & Sahami’s method, which is one of greedy feature selection ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: DEStech Transactions on Social Science, Education and Human Science
سال: 2017
ISSN: 2475-0042
DOI: 10.12783/dtssehs/icss2016/9199